894 research outputs found

    Text Document Classification: An Approach Based on Indexing

    ABSTRACT In this paper we propose a new method of classifying text documents. Unlike conventional vector space models, the proposed method preserves the sequence of term occurrence in a document. The term sequence is preserved with the help of a novel data structure called the ‘Status Matrix’, and a corresponding classification technique is proposed for efficient classification of text documents. In addition, to avoid sequential matching during classification, we propose to index the terms in a B-tree, an efficient indexing scheme. Each term in the B-tree is associated with a list of class labels of those documents which contain the term. To corroborate the efficacy of the proposed representation and status matrix based classification, we have conducted extensive experiments on various datasets. Original Source URL: http://aircconline.com/ijdkp/V2N1/2112ijdkp04.pdf For more details: http://airccse.org/journal/ijdkp/vol2.htm
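
As a rough illustration of the indexing idea in this abstract, the sketch below keeps terms in sorted order and finds them by binary search, standing in for a B-tree (Python's standard library does not provide one). The class name `TermIndex`, the voting rule in `classify`, and the toy documents are assumptions made for illustration; the abstract does not say how the label lists are combined, and the Status Matrix is not modelled here.

```python
from bisect import bisect_left
from collections import Counter


class TermIndex:
    """Sorted term index; a stand-in for the B-tree named in the abstract."""

    def __init__(self):
        self._terms = []        # distinct terms, kept in sorted order
        self._label_lists = []  # _label_lists[i] holds the labels for _terms[i]

    def _find(self, term):
        # Binary search: O(log n) comparisons, similar in spirit to a B-tree probe.
        i = bisect_left(self._terms, term)
        return i, i < len(self._terms) and self._terms[i] == term

    def add_document(self, text, label):
        for term in set(text.lower().split()):
            i, found = self._find(term)
            if not found:
                self._terms.insert(i, term)
                self._label_lists.insert(i, [])
            if label not in self._label_lists[i]:
                self._label_lists[i].append(label)

    def lookup(self, term):
        i, found = self._find(term)
        return list(self._label_lists[i]) if found else []

    def classify(self, text):
        # Assumed rule: each query term votes for every label it is linked to.
        votes = Counter()
        for term in text.lower().split():
            votes.update(self.lookup(term))
        return votes.most_common(1)[0][0] if votes else None


index = TermIndex()
index.add_document("goal match team striker", "sports")
index.add_document("election vote parliament", "politics")
index.add_document("team vote captain", "sports")
```

With these toy documents, `index.lookup("vote")` returns both labels, while a query made only of unseen terms yields `None` from `classify`.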

    A New Feature Selection Method based on Intuitionistic Fuzzy Entropy to Categorize Text Documents

    Selecting highly discriminative features from text documents is a major challenge in categorization. Feature selection is an important task that involves dimensionality reduction of the feature matrix, which in turn enhances the performance of categorization. This article presents a new feature selection method based on Intuitionistic Fuzzy Entropy (IFE) for text categorization. First, the Intuitionistic Fuzzy C-Means (IFCM) clustering method is employed to compute the intuitionistic membership values. The computed intuitionistic membership values are used to estimate intuitionistic fuzzy entropy via the match degree. Features with lower entropy values are then selected to categorize the text documents. To assess the efficacy of the proposed method, experiments are conducted on three standard benchmark datasets using three classifiers. F-measure is used to assess the performance of the classifiers. The proposed method shows impressive results compared to other well-known feature selection methods. Moreover, the Intuitionistic Fuzzy Set (IFS) property addresses the uncertainty limitations of traditional fuzzy sets.

    Automatic Irony Detection using Feature Fusion and Ensemble Classifier

    With the advent of micro-blogging sites, users have become pioneers in expressing their sentiments and emotions on global issues through text. Automatic detection and classification of sentiments such as sarcastic or ironic content in micro-blogging reviews is a challenging task. It requires a system that manages some kind of knowledge to interpret the sentiment expressed in text. The available approaches are quite limited in their capability and scope to detect ironic utterances present in the text. In this regard, the paper proposes feature fusion to provide knowledge to the system through alternative sets of features obtained using linguistic and content-based text features. The proposed work extracts five sets of linguistic features and fuses them with features selected using two stages of a feature selection method. To demonstrate the effectiveness of the proposed method, we conduct extensive experimentation by selecting different feature subsets. The performance of the proposed method is evaluated using Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT) and ensemble classifiers. The experimental results show that the proposed approach significantly outperforms the conventional methods.

    Anomaly based Intrusion Detection using Modified Fuzzy Clustering

    This paper presents a network anomaly detection method based on fuzzy clustering. Computer security has become an increasingly vital field in computer science in response to the proliferation of private sensitive information. As a result, the Intrusion Detection System has become an indispensable component of computer security. The proposed method consists of three steps: pre-processing, feature selection and clustering. In the pre-processing step, duplicate samples are eliminated from the sample set. Next, principal component analysis is adopted to select the most discriminative features. In the clustering step, the network samples are clustered using the Robust Spatial Kernel Fuzzy C-Means (RSKFCM) algorithm. RSKFCM is a variant of traditional Fuzzy C-Means which considers the neighbourhood membership information and uses a kernel distance metric. To evaluate the proposed method, we conducted experiments on a standard dataset and compared the results with state-of-the-art methods. We used cluster validity indices, accuracy and false positive rate as performance metrics. Experimental results indicate that the proposed method achieves better results than the other methods.
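
For reference, RSKFCM builds on the traditional Fuzzy C-Means algorithm. The sketch below shows only the textbook FCM membership update with Euclidean distance, not the robust spatial kernel variant described in the abstract; the function name `fcm_memberships`, the fuzzifier `m = 2.0`, and the toy points are illustrative assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the membership of point i in cluster j (standard FCM)."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point lying exactly on a center belongs fully to that center
            # (shared equally if several centers coincide with it).
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
            continue
        # u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))
        row = [
            1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
            for d_j in dists
        ]
        memberships.append(row)
    return memberships


# Hypothetical toy data: three points on a line, two cluster centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
```

Each row of `u` sums to 1, and a point closer to a center receives a higher membership in that cluster. A kernel or spatial variant would replace the distance computation and add a neighbourhood term, which this sketch leaves out.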

    Automated ECG Analysis for Localizing Thrombus in Culprit Artery Using Rule Based Information Fuzzy Network

    Cardiovascular diseases are among the foremost causes of mortality in today’s world. The prognosis for cardiovascular diseases is usually based on the ECG signal, a simple 12-lead electrocardiogram (ECG) that gives complete information about the function of the heart, including the amplitude and time interval of the P-QRST-U segment. This article proposes a novel approach to identify the location of the thrombus in the culprit artery using the Information Fuzzy Network (IFN). The Information Fuzzy Network, being a supervised machine learning technique, takes known evidence based on rules to create a predictive classification model, with the thrombus location obtained from the vast input ECG data. These rules are well-defined procedures for selecting the hypothesis that best fits a set of observations. Results show that the proposed approach yields an accuracy of 92.30%. This novel approach is shown to be a viable ECG analysis approach for identifying the culprit artery and thus localizing the thrombus.

    A Convolution Neural Network Engine for Sclera Recognition

    The world is shifting to the digital era at an enormous pace. This rise in digital technology has created plenty of applications in the digital space, which demand a secure environment for transacting and for authenticating the genuineness of end users. Biometric systems and their applications have shown great potential for use in the tech industry. Among various biometric traits, the sclera trait is attracting researchers to experiment with and explore its characteristics for recognition systems. This paper, which is the first of its kind, explores the power of the Convolution Neural Network (CNN) for sclera recognition by developing a neural model that trains its neural engine for a recognition system. To do so, the proposed work uses the standard benchmark dataset called the Sclera Segmentation and Recognition Benchmarking Competition (SSRBC 2015) dataset, which comprises 734 images captured at different viewing angles from 30 different classes. The results of the proposed methodology showcase the potential of neural learning for sclera recognition systems.

    Multilayer Feedforward Neural Network for Internet Traffic Classification

    Recently, efficient internet traffic classification has gained attention as a way to improve service quality in IP networks. However, existing solutions struggle to handle imbalanced datasets, which have a highly uneven distribution of flows between the classes. In this paper, we propose a multilayer feedforward neural network architecture to handle the highly imbalanced dataset. In the proposed model, we used a variation of the multilayer perceptron with 4 hidden layers (called mountain mirror networks), which performs the feature transformation effectively. To check the efficacy of the proposed model, we used the Cambridge dataset, which consists of 248 features spread across 10 classes. Experimentation is carried out for two variants of the same dataset: the standard one and a derived subset. The proposed model achieved an accuracy of 99.08% on the highly imbalanced (standard) dataset.

    Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method

    Social networking sites have become popular and common places for sharing a wide range of emotions through short texts. These emotions include happiness, sadness, anxiety, fear, etc. Analyzing short texts helps in identifying the sentiment expressed by the crowd. Sentiment analysis on IMDb movie reviews identifies the overall sentiment or opinion expressed by a reviewer towards a movie. Many researchers are working on pruning the sentiment analysis model so that it clearly identifies and distinguishes between a positive review and a negative review. In the proposed work, we show that the use of hybrid features obtained by concatenating machine learning features (TF, TF-IDF) with lexicon features (positive-negative word count, connotation) gives better results in terms of both accuracy and complexity when tested against classifiers such as SVM, Naïve Bayes, KNN and Maximum Entropy. The proposed model clearly differentiates between a positive review and a negative review. Since understanding the context of the reviews plays an important role in classification, using hybrid features helps in capturing the context of the movie reviews and hence increases the accuracy of classification.
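
The hybrid representation described above can be sketched as a plain concatenation of term-frequency features with lexicon counts. The tiny vocabulary, the `POSITIVE` and `NEGATIVE` word lists, and the function name `hybrid_features` are hypothetical; the abstract does not list the lexicon used, and TF-IDF weighting and the connotation feature are omitted for brevity.

```python
from collections import Counter

# Hypothetical mini-lexicons; a real system would load a published lexicon.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "boring", "awful"}


def hybrid_features(review, vocabulary):
    """Concatenate term-frequency features with positive/negative word counts."""
    tokens = review.lower().split()
    counts = Counter(tokens)
    n = len(tokens) or 1  # avoid division by zero on empty reviews

    # Machine-learning part: relative frequency of each vocabulary term.
    tf = [counts[term] / n for term in vocabulary]

    # Lexicon part: raw counts of positive and negative words.
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)

    return tf + [float(pos), float(neg)]


vocabulary = ["plot", "acting", "great", "boring"]
features = hybrid_features("great acting but boring plot", vocabulary)
```

The resulting vector has one entry per vocabulary term plus two lexicon entries, and could be passed to any of the classifiers named in the abstract.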

    Functional outcome of middle third humeral shaft fractures treated with anteromedial plate osteosynthesis through an anterolateral approach

    Background: The main aim of treating humeral shaft fractures is to establish union with an acceptable humeral alignment and to restore the patient to the pre-injury level of function. Plate osteosynthesis remains the standard surgical treatment for displaced middle third humeral fractures. The most commonly used approaches for treating these fractures are posterior and anterolateral, but these approaches carry a risk of iatrogenic radial nerve injury. Our aim is to study the incidence of radial nerve palsy and the functional outcome of the anterolateral approach with anteromedial plating. Methods: A total of 26 patients in the age group of 21 to 62 years, treated by anteromedial plating through an anterolateral approach for humeral shaft fracture, were included in this prospective study. Functional assessment was done using the Rodriguez-Merchan criteria. Results: 26 patients with humeral shaft fracture were included in the study, of whom 19 (73%) were less than 40 years of age. The most common fracture pattern was the A3 type, and the mean duration of surgery was 60±10 min for anteromedial plating. The time taken for fracture union was less than 4 months in most patients (88%). There was no evidence of iatrogenic radial nerve injury. Functional assessment using the Rodriguez-Merchan criteria showed that 84.6% of the patients had a good to excellent functional outcome. Conclusions: For treatment of displaced middle third humeral fractures, open reduction with anteromedial plating through an anterolateral approach is surgically safer and gives a better functional outcome.